Improve SSR raymarching performance #99693
base: master
Conversation
The white highlight beneath the cube seems to be accentuated by the new algorithm. Is it possible to tone it down to a level similar to before, where it wasn't as obvious? Perhaps the reflection is just mistakenly offset by a few pixels, and that's what's causing the highlight to look this way.
Just fixed the white contact line.
@RPicster thanks for testing. Can you share the project files? Also, this PR is mostly focused on performance, so don't expect quality improvements in most cases.
ssr-test.zip
I think the issue you're seeing actually comes from a quality improvement in this PR, resulting in more accurate reflections, whereas the current SSR tends to elongate things and fill the gaps you're seeing. The effect can be seen in the before and after screenshots of the PR.
This PR brings a major rewrite of the Screen Space Reflection raymarching code, targeting performance optimization:
- Implements a DDA algorithm that marches the ray simultaneously in NDC and homogeneous view space, as described in "Efficient GPU Screen-Space Ray Tracing" (Morgan McGuire et al.). A minimal sketch of the marching scheme follows this list.
- Produces a linear depth buffer during the scale pre-pass (this was actually already the case for the single-eye setup, but not for VR). In conjunction with homogeneous view space marching, this removes the need for any reprojection in the ray marching loop (see the depth-linearization sketch after this list).
- Removes the normal-roughness buffer fetches during marching, which were used to perform backface culling. This is now done by comparing the current and previous samples' depths and rejecting hits when the ray exits the volume.
- Solves 2 issues related to Projection::get_z_far() and Projection::is_orthogonal(). These can break under certain circumstances: typically, when the zfar / znear ratio is very large, the projection matrix becomes infinite and it is no longer possible to extract zfar from it.
- Hopefully improves code readability and establishes a good foundation for further improvements. A few ideas I leave to further PRs:
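
To make the marching scheme above more concrete, here is a minimal CPU-side sketch in C++. It is not the PR's shader code: the function names, the `scene_depth` callback, and the tolerance handling are assumptions for illustration only. It shows two of the ideas from the list: stepping one pixel at a time along the dominant screen axis (DDA) while linearly interpolating 1/z, so the ray depth stays perspective-correct without any reprojection inside the loop, and rejecting crossings where the ray exits the depth volume instead of fetching normals for backface culling.

```cpp
// Minimal sketch of perspective-correct screen-space ray marching via DDA,
// loosely following "Efficient GPU Screen-Space Ray Tracing" (McGuire et al.).
// This is NOT the PR's shader code; names and conventions are illustrative.

#include <algorithm>
#include <cmath>
#include <cstdio>
#include <functional>

struct Vec2 { float x, y; };

// p0/p1 are the screen-space (pixel) endpoints of the reflected ray segment,
// z0/z1 the corresponding positive view-space depths.  `scene_depth(x, y)`
// stands in for a fetch from the linear depth buffer written by the pre-pass.
bool trace_ray(Vec2 p0, float z0, Vec2 p1, float z1, float depth_tolerance,
		const std::function<float(int, int)> &scene_depth, Vec2 &r_hit_pixel) {
	// DDA setup: one-pixel steps along the dominant screen axis.
	float dx = p1.x - p0.x, dy = p1.y - p0.y;
	int steps = (int)std::ceil(std::max(std::fabs(dx), std::fabs(dy)));
	if (steps == 0) {
		return false;
	}
	Vec2 pixel_step = { dx / steps, dy / steps };

	// Depth is hyperbolic across the screen, but 1/z interpolates linearly,
	// so we march 1/z and invert it per step: no reprojection inside the loop.
	float inv_z = 1.0f / z0;
	float inv_z_step = (1.0f / z1 - inv_z) / steps;

	Vec2 pixel = p0;
	float prev_ray_z = z0;
	float prev_scene_z = scene_depth((int)p0.x, (int)p0.y);

	for (int i = 0; i < steps; i++) {
		pixel.x += pixel_step.x;
		pixel.y += pixel_step.y;
		inv_z += inv_z_step;

		float ray_z = 1.0f / inv_z; // perspective-correct ray depth at this pixel
		float scene_z = scene_depth((int)pixel.x, (int)pixel.y);

		bool behind_now = ray_z > scene_z;
		bool behind_before = prev_ray_z > prev_scene_z;

		if (behind_now && !behind_before) {
			// The ray just crossed the depth surface.  Accept only if it is
			// entering the volume (depth increasing, i.e. moving away from the
			// camera) and within the thickness tolerance.  A crossing while the
			// ray exits the volume is treated as a backface and rejected,
			// replacing the normal-roughness fetch used previously.
			bool entering = ray_z >= prev_ray_z;
			bool within_tolerance = (ray_z - scene_z) <= depth_tolerance;
			if (entering && within_tolerance) {
				r_hit_pixel = pixel;
				return true;
			}
		}
		prev_ray_z = ray_z;
		prev_scene_z = scene_z;
	}
	return false;
}

int main() {
	// Toy usage: a flat "floor" 10 units away everywhere.
	auto flat_depth = [](int, int) { return 10.0f; };
	Vec2 hit = { -1.0f, -1.0f };
	bool found = trace_ray({ 0, 0 }, 2.0f, { 64, 0 }, 20.0f, 0.5f, flat_depth, hit);
	std::printf("hit: %d at (%.1f, %.1f)\n", found, hit.x, hit.y);
	return 0;
}
```

The key point of the design is that view-space depth is hyperbolic across the screen while its reciprocal is affine, so one division per step replaces any matrix work inside the loop.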
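For completeness, here is a small sketch of what "linear depth" means in this context, assuming a conventional non-reversed [0, 1] depth buffer and a finite far plane (the engine's actual conventions, e.g. reversed-Z or an infinite far plane, would require a different formula, which is exactly why centralizing the conversion in a pre-pass helps). The pre-pass performs this conversion once, so the marching loop can compare depths directly.

```cpp
#include <cstdio>

// Converts a non-linear [0, 1] depth-buffer value into a linear view-space
// depth for a standard perspective projection with near/far planes.
// Assumption: non-reversed depth; this is an illustrative formula, not the
// engine's code.
float linearize_depth(float depth, float z_near, float z_far) {
	return (z_near * z_far) / (z_far - depth * (z_far - z_near));
}

int main() {
	// depth 0 maps to the near plane, depth 1 to the far plane.
	std::printf("%f %f\n", linearize_depth(0.0f, 0.05f, 100.0f),
			linearize_depth(1.0f, 0.05f, 100.0f)); // prints 0.05 and 100.0
	return 0;
}
```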
Visual differences
Cube roughness is 0.2, floor roughness is 0.0.
Depth threshold is 0.1.
Raymarching uses 512 steps.
This is the single-eye case. Any help testing it in VR is welcome.
Also, any tests with more complex scenes would be appreciated.
Performance improvements
These should be meaningful for both single-eye and VR setups. Although I couldn't measure it statistically (I haven't yet sorted out the render graph messing up the debug markers, despite active support from @clayjohn @Ansraer and @DarioSamo), the GPU traces below suggest a ~20% reduction in compute time.
In this context, please take the results below with a grain of salt, as they reflect my interpretation (also, please ignore the markers):
Any help in making this analysis stronger is welcome.
Top chart: before
Bottom chart: with this PR